Usenet 1993 July

home *** CD-ROM | disk | FTP | other *** search

/ Usenet 1993 July / InfoMagic USENET CD-ROM July 1993.ISO / sources / unix / volume23 / flex2.3 / part08 < prev next >

Wrap

Internet Message Format | 1990-10-10 | 51.3 KB

Subject: v23i044: Flex, a fast lex replacement, Part08/10 Newsgroups: comp.sources.unix Approved: rsalz@uunet.UU.NET X-Checksum-Snefru: ca66ccd4 e23a0f71 33039016 677e8981 Submitted-by: Vern Paxson <vern@cs.cornell.edu> Posting-number: Volume 23, Issue 44 Archive-name: flex2.3/part08 #! /bin/sh # This is a shell archive. Remove anything before this line, then feed it # into a shell via "sh file" or similar. To overwrite existing files, # type "sh file -c". # The tool that generated this appeared in the comp.sources.unix newsgroup; # send mail to comp-sources-unix@uunet.uu.net if you want that tool. # Contents: flexdoc.1.02 misc.c nfa.c # Wrapped by rsalz@litchi.bbn.com on Wed Oct 10 13:24:04 1990 PATH=/bin:/usr/bin:/usr/ucb ; export PATH echo If this archive is complete, you will see the following message: echo ' "shar: End of archive 8 (of 10)."' if test -f 'flexdoc.1.02' -a "${1}" != "-c" ; then echo shar: Will not clobber existing file \"'flexdoc.1.02'\" else echo shar: Extracting \"'flexdoc.1.02'\" $15514 characters$ sed "s/^X//" >'flexdoc.1.02' <<'END_OF_FILE' XAnother area where the user can increase a scanner's performance X(and one that's easier to implement) arises from the fact that Xthe longer the tokens matched, the faster the scanner will run. XThis is because with long tokens the processing of most input Xcharacters takes place in the (short) inner scanning loop, and Xdoes not often have to go through the additional work of setting up Xthe scanning environment (e.g., X.B yytext) Xfor the action. Recall the scanner for C comments: X.nf X X %x comment X %% X int line_num = 1; X X "/*" BEGIN(comment); X X <comment>[^*\\n]* X <comment>"*"+[^*/\\n]* X <comment>\\n ++line_num; X <comment>"*"+"/" BEGIN(INITIAL); X X.fi XThis could be sped up by writing it as: X.nf X X %x comment X %% X int line_num = 1; X X "/*" BEGIN(comment); X X <comment>[^*\\n]* X <comment>[^*\\n]*\\n ++line_num; X <comment>"*"+[^*/\\n]* X <comment>"*"+[^*/\\n]*\\n ++line_num; X <comment>"*"+"/" BEGIN(INITIAL); X X.fi XNow instead of each newline requiring the processing of another Xaction, recognizing the newlines is "distributed" over the other rules Xto keep the matched text as long as possible. Note that X.I adding Xrules does X.I not Xslow down the scanner! The speed of the scanner is independent Xof the number of rules or (modulo the considerations given at the Xbeginning of this section) how complicated the rules are with Xregard to operators such as '*' and '|'. X.LP XA final example in speeding up a scanner: suppose you want to scan Xthrough a file containing identifiers and keywords, one per line Xand with no other extraneous characters, and recognize all the Xkeywords. A natural first approach is: X.nf X X %% X asm | X auto | X break | X ... etc ... X volatile | X while /* it's a keyword */ X X .|\\n /* it's not a keyword */ X X.fi XTo eliminate the back-tracking, introduce a catch-all rule: X.nf X X %% X asm | X auto | X break | X ... etc ... X volatile | X while /* it's a keyword */ X X [a-z]+ | X .|\\n /* it's not a keyword */ X X.fi XNow, if it's guaranteed that there's exactly one word per line, Xthen we can reduce the total number of matches by a half by Xmerging in the recognition of newlines with that of the other Xtokens: X.nf X X %% X asm\\n | X auto\\n | X break\\n | X ... etc ... X volatile\\n | X while\\n /* it's a keyword */ X X [a-z]+\\n | X .|\\n /* it's not a keyword */ X X.fi XOne has to be careful here, as we have now reintroduced backtracking Xinto the scanner. In particular, while X.I we Xknow that there will never be any characters in the input stream Xother than letters or newlines, X.I flex Xcan't figure this out, and it will plan for possibly needing backtracking Xwhen it has scanned a token like "auto" and then the next character Xis something other than a newline or a letter. Previously it would Xthen just match the "auto" rule and be done, but now it has no "auto" Xrule, only a "auto\\n" rule. To eliminate the possibility of backtracking, Xwe could either duplicate all rules but without final newlines, or, Xsince we never expect to encounter such an input and therefore don't Xhow it's classified, we can introduce one more catch-all rule, this Xone which doesn't include a newline: X.nf X X %% X asm\\n | X auto\\n | X break\\n | X ... etc ... X volatile\\n | X while\\n /* it's a keyword */ X X [a-z]+\\n | X [a-z]+ | X .|\\n /* it's not a keyword */ X X.fi XCompiled with X.B -Cf, Xthis is about as fast as one can get a X.I flex Xscanner to go for this particular problem. X.LP XA final note: X.I flex Xis slow when matching NUL's, particularly when a token contains Xmultiple NUL's. XIt's best to write rules which match X.I short Xamounts of text if it's anticipated that the text will often include NUL's. X.SH INCOMPATIBILITIES WITH LEX AND POSIX X.I flex Xis a rewrite of the Unix X.I lex Xtool (the two implementations do not share any code, though), Xwith some extensions and incompatibilities, both of which Xare of concern to those who wish to write scanners acceptable Xto either implementation. At present, the POSIX X.I lex Xdraft is Xvery close to the original X.I lex Ximplementation, so some of these Xincompatibilities are also in conflict with the POSIX draft. But Xthe intent is that except as noted below, X.I flex Xas it presently stands will Xultimately be POSIX conformant (i.e., that those areas of conflict with Xthe POSIX draft will be resolved in X.I flex's Xfavor). Please bear in Xmind that all the comments which follow are with regard to the POSIX X.I draft Xstandard of Summer 1989, and not the final document (or subsequent Xdrafts); they are included so X.I flex Xusers can be aware of the standardization issues and those areas where X.I flex Xmay in the near future undergo changes incompatible with Xits current definition. X.LP X.I flex Xis fully compatible with X.I lex Xwith the following exceptions: X.IP - XThe undocumented X.I lex Xscanner internal variable X.B yylineno Xis not supported. It is difficult to support this option efficiently, Xsince it requires examining every character scanned and reexamining Xthe characters when the scanner backs up. XThings get more complicated when the end of buffer or file is reached or a XNUL is scanned (since the scan must then be restarted with the proper line Xnumber count), or the user uses the yyless(), unput(), or REJECT actions, Xor the multiple input buffer functions. X.IP XThe fix is to add rules which, upon seeing a newline, increment Xyylineno. This is usually an easy process, though it can be a drag if some Xof the patterns can match multiple newlines along with other characters. X.IP Xyylineno is not part of the POSIX draft. X.IP - XThe X.B input() Xroutine is not redefinable, though it may be called to read characters Xfollowing whatever has been matched by a rule. If X.B input() Xencounters an end-of-file the normal X.B yywrap() Xprocessing is done. A ``real'' end-of-file is returned by X.B input() Xas X.I EOF. X.IP XInput is instead controlled by redefining the X.B YY_INPUT Xmacro. X.IP XThe X.I flex Xrestriction that X.B input() Xcannot be redefined is in accordance with the POSIX draft, but X.B YY_INPUT Xhas not yet been accepted into the draft (and probably won't; it looks Xlike the draft will simply not specify any way of controlling the Xscanner's input other than by making an initial assignment to X.I yyin). X.IP - X.I flex Xscanners do not use stdio for input. Because of this, when writing an Xinteractive scanner one must explicitly call fflush() on the Xstream associated with the terminal after writing out a prompt. XWith X.I lex Xsuch writes are automatically flushed since X.I lex Xscanners use X.B getchar() Xfor their input. Also, when writing interactive scanners with X.I flex, Xthe X.B -I Xflag must be used. X.IP - X.I flex Xscanners are not as reentrant as X.I lex Xscanners. In particular, if you have an interactive scanner and Xan interrupt handler which long-jumps out of the scanner, and Xthe scanner is subsequently called again, you may get the following Xmessage: X.nf X X fatal flex scanner internal error--end of buffer missed X X.fi XTo reenter the scanner, first use X.nf X X yyrestart( yyin ); X X.fi X.IP - X.B output() Xis not supported. XOutput from the X.B ECHO Xmacro is done to the file-pointer X.I yyout X(default X.I stdout). X.IP XThe POSIX draft mentions that an X.B output() Xroutine exists but currently gives no details as to what it does. X.IP - X.I lex Xdoes not support exclusive start conditions (%x), though they Xare in the current POSIX draft. X.IP - XWhen definitions are expanded, X.I flex Xencloses them in parentheses. XWith lex, the following: X.nf X X NAME [A-Z][A-Z0-9]* X %% X foo{NAME}? printf( "Found it\\n" ); X %% X X.fi Xwill not match the string "foo" because when the macro Xis expanded the rule is equivalent to "foo[A-Z][A-Z0-9]*?" Xand the precedence is such that the '?' is associated with X"[A-Z0-9]*". With X.I flex, Xthe rule will be expanded to X"foo([A-Z][A-Z0-9]*)?" and so the string "foo" will match. XNote that because of this, the X.B ^, $, <s>, /, Xand X.B <<EOF>> Xoperators cannot be used in a X.I flex Xdefinition. X.IP XThe POSIX draft interpretation is the same as X.I flex's. X.IP - XTo specify a character class which matches anything but a left bracket (']'), Xin X.I lex Xone can use "[^]]" but with X.I flex Xone must use "[^\\]]". The latter works with X.I lex, Xtoo. X.IP - XThe X.I lex X.B %r X(generate a Ratfor scanner) option is not supported. It is not part Xof the POSIX draft. X.IP - XIf you are providing your own yywrap() routine, you must include a X"#undef yywrap" in the definitions section (section 1). Note that Xthe "#undef" will have to be enclosed in %{}'s. X.IP XThe POSIX draft Xspecifies that yywrap() is a function and this is very unlikely to change; so X.I flex users are warned Xthat X.B yywrap() Xis likely to be changed to a function in the near future. X.IP - XAfter a call to X.B unput(), X.I yytext Xand X.I yyleng Xare undefined until the next token is matched. This is not the case with X.I lex Xor the present POSIX draft. X.IP - XThe precedence of the X.B {} X(numeric range) operator is different. X.I lex Xinterprets "abc{1,3}" as "match one, two, or Xthree occurrences of 'abc'", whereas X.I flex Xinterprets it as "match 'ab' Xfollowed by one, two, or three occurrences of 'c'". The latter is Xin agreement with the current POSIX draft. X.IP - XThe precedence of the X.B ^ Xoperator is different. X.I lex Xinterprets "^foo|bar" as "match either 'foo' at the beginning of a line, Xor 'bar' anywhere", whereas X.I flex Xinterprets it as "match either 'foo' or 'bar' if they come at the beginning Xof a line". The latter is in agreement with the current POSIX draft. X.IP - XTo refer to yytext outside of the scanner source file, Xthe correct definition with X.I flex Xis "extern char *yytext" rather than "extern char yytext[]". XThis is contrary to the current POSIX draft but a point on which X.I flex Xwill not be changing, as the array representation entails a Xserious performance penalty. It is hoped that the POSIX draft will Xbe emended to support the X.I flex Xvariety of declaration (as this is a fairly painless change to Xrequire of X.I lex Xusers). X.IP - X.I yyin Xis X.I initialized Xby X.I lex Xto be X.I stdin; X.I flex, Xon the other hand, Xinitializes X.I yyin Xto NULL Xand then X.I assigns Xit to X.I stdin Xthe first time the scanner is called, providing X.I yyin Xhas not already been assigned to a non-NULL value. The difference is Xsubtle, but the net effect is that with X.I flex Xscanners, X.I yyin Xdoes not have a valid value until the scanner has been called. X.IP - XThe special table-size declarations such as X.B %a Xsupported by X.I lex Xare not required by X.I flex Xscanners; X.I flex Xignores them. X.IP - XThe name X.bd XFLEX_SCANNER Xis #define'd so scanners may be written for use with either X.I flex Xor X.I lex. X.LP XThe following X.I flex Xfeatures are not included in X.I lex Xor the POSIX draft standard: X.nf X X yyterminate() X <<EOF>> X YY_DECL X #line directives X %{}'s around actions X yyrestart() X comments beginning with '#' (deprecated) X multiple actions on a line X X.fi XThis last feature refers to the fact that with X.I flex Xyou can put multiple actions on the same line, separated with Xsemi-colons, while with X.I lex, Xthe following X.nf X X foo handle_foo(); ++num_foos_seen; X X.fi Xis (rather surprisingly) truncated to X.nf X X foo handle_foo(); X X.fi X.I flex Xdoes not truncate the action. Actions that are not enclosed in Xbraces are simply terminated at the end of the line. X.SH DIAGNOSTICS X.I reject_used_but_not_detected undefined Xor X.I yymore_used_but_not_detected undefined - XThese errors can occur at compile time. They indicate that the Xscanner uses X.B REJECT Xor X.B yymore() Xbut that X.I flex Xfailed to notice the fact, meaning that X.I flex Xscanned the first two sections looking for occurrences of these actions Xand failed to find any, but somehow you snuck some in (via a #include Xfile, for example). Make an explicit reference to the action in your X.I flex Xinput file. (Note that previously X.I flex Xsupported a X.B %used/%unused Xmechanism for dealing with this problem; this feature is still supported Xbut now deprecated, and will go away soon unless the author hears from Xpeople who can argue compellingly that they need it.) X.LP X.I flex scanner jammed - Xa scanner compiled with X.B -s Xhas encountered an input string which wasn't matched by Xany of its rules. X.LP X.I flex input buffer overflowed - Xa scanner rule matched a string long enough to overflow the Xscanner's internal input buffer (16K bytes by default - controlled by X.B YY_BUF_SIZE Xin "flex.skel". Note that to redefine this macro, you must first X.B #undefine Xit). X.LP X.I scanner requires -8 flag - XYour scanner specification includes recognizing 8-bit characters and Xyou did not specify the -8 flag (and your site has not installed flex Xwith -8 as the default). X.LP X.I Xfatal flex scanner internal error--end of buffer missed - XThis can occur in an scanner which is reentered after a long-jump Xhas jumped out (or over) the scanner's activation frame. Before Xreentering the scanner, use: X.nf X X yyrestart( yyin ); X X.fi X.LP X.I too many %t classes! - XYou managed to put every single character into its own %t class. X.I flex Xrequires that at least one of the classes share characters. X.SH DEFICIENCIES / BUGS XSee flex(1). X.SH "SEE ALSO" X.LP Xflex(1), lex(1), yacc(1), sed(1), awk(1). X.LP XM. E. Lesk and E. Schmidt, X.I LEX - Lexical Analyzer Generator X.SH AUTHOR XVern Paxson, with the help of many ideas and much inspiration from XVan Jacobson. Original version by Jef Poskanzer. The fast table Xrepresentation is a partial implementation of a design done by Van XJacobson. The implementation was done by Kevin Gong and Vern Paxson. X.LP XThanks to the many X.I flex Xbeta-testers, feedbackers, and contributors, especially Casey XLeedom, benson@odi.com, Keith Bostic, XFrederic Brehm, Nick Christopher, Jason Coughlin, XScott David Daniels, Leo Eskin, XChris Faylor, Eric Goldman, Eric XHughes, Jeffrey R. Jones, Kevin B. Kenny, Ronald Lamprecht, XGreg Lee, Craig Leres, Mohamed el Lozy, Jim Meyering, Marc Nozell, Esmond Pitt, XJef Poskanzer, Jim Roskind, XDave Tallman, Frank Whaley, Ken Yap, and those whose names Xhave slipped my marginal mail-archiving skills but whose contributions Xare appreciated all the same. X.LP XThanks to Keith Bostic, John Gilmore, Craig Leres, Bob XMulcahy, Rich Salz, and Richard Stallman for help with various distribution Xheadaches. X.LP XThanks to Esmond Pitt and Earle Horton for 8-bit character support; Xto Benson Margulies and Fred XBurke for C++ support; to Ove Ewerlid for the basics of support for XNUL's; and to Eric Hughes for the basics of support for multiple buffers. X.LP XWork is being done on extending X.I flex Xto generate scanners in which the Xstate machine is directly represented in C code rather than tables. XThese scanners may well be substantially faster than those generated Xusing -f or -F. If you are working in this area and are interested Xin comparing notes and seeing whether redundant work can be avoided, Xcontact Ove Ewerlid (ewerlid@mizar.DoCS.UU.SE). X.LP XThis work was primarily done when I was at the Real Time Systems Group Xat the Lawrence Berkeley Laboratory in Berkeley, CA. Many thanks to all there Xfor the support I received. X.LP XSend comments to: X.nf X X Vern Paxson X Computer Science Department X 4126 Upson Hall X Cornell University X Ithaca, NY 14853-7501 X X vern@cs.cornell.edu X decvax!cornell!vern X X.fi END_OF_FILE if test 15514 -ne `wc -c <'flexdoc.1.02'`; then echo shar: \"'flexdoc.1.02'\" unpacked with wrong size! fi # end of 'flexdoc.1.02' fi if test -f 'misc.c' -a "${1}" != "-c" ; then echo shar: Will not clobber existing file \"'misc.c'\" else echo shar: Extracting \"'misc.c'\" $14912 characters$ sed "s/^X//" >'misc.c' <<'END_OF_FILE' X/* misc - miscellaneous flex routines */ X X/*- X * Copyright (c) 1990 The Regents of the University of California. X * All rights reserved. X * X * This code is derived from software contributed to Berkeley by X * Vern Paxson. X * X * The United States Government has rights in this work pursuant X * to contract no. DE-AC03-76SF00098 between the United States X * Department of Energy and the University of California. X * X * Redistribution and use in source and binary forms are permitted provided X * that: (1) source distributions retain this entire copyright notice and X * comment, and (2) distributions including binaries display the following X * acknowledgement: ``This product includes software developed by the X * University of California, Berkeley and its contributors'' in the X * documentation or other materials provided with the distribution and in X * all advertising materials mentioning features or use of this software. X * Neither the name of the University nor the names of its contributors may X * be used to endorse or promote products derived from this software without X * specific prior written permission. X * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED X * WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF X * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. X */ X X#ifndef lint Xstatic char rcsid[] = X "@(#) $Header: /usr/fsys/odin/a/vern/flex/RCS/misc.c,v 2.9 90/08/14 00:10:24 vern Exp $ (LBL)"; X#endif X X#include <ctype.h> X#include "flexdef.h" X X X/* ANSI C does not guarantee that isascii() is defined */ X#ifndef isascii X#define isascii(c) ((c) <= 0177) X#endif X X X X/* declare functions that have forward references */ X Xvoid dataflush PROTO(()); Xint otoi PROTO((Char [])); X X X/* action_out - write the actions from the temporary file to lex.yy.c X * X * synopsis X * action_out(); X * X * Copies the action file up to %% (or end-of-file) to lex.yy.c X */ X Xvoid action_out() X X { X char buf[MAXLINE]; X X while ( fgets( buf, MAXLINE, temp_action_file ) != NULL ) X if ( buf[0] == '%' && buf[1] == '%' ) X break; X else X fputs( buf, stdout ); X } X X X/* allocate_array - allocate memory for an integer array of the given size */ X Xvoid *allocate_array( size, element_size ) Xint size, element_size; X X { X register void *mem; X X /* on 16-bit int machines (e.g., 80286) we might be trying to X * allocate more than a signed int can hold, and that won't X * work. Cheap test: X */ X if ( element_size * size <= 0 ) X flexfatal( "request for < 1 byte in allocate_array()" ); X X mem = (void *) malloc( (unsigned) (element_size * size) ); X X if ( mem == NULL ) X flexfatal( "memory allocation failed in allocate_array()" ); X X return ( mem ); X } X X X/* all_lower - true if a string is all lower-case X * X * synopsis: X * Char *str; X * int all_lower(); X * true/false = all_lower( str ); X */ X Xint all_lower( str ) Xregister Char *str; X X { X while ( *str ) X { X if ( ! isascii( *str ) || ! islower( *str ) ) X return ( 0 ); X ++str; X } X X return ( 1 ); X } X X X/* all_upper - true if a string is all upper-case X * X * synopsis: X * Char *str; X * int all_upper(); X * true/false = all_upper( str ); X */ X Xint all_upper( str ) Xregister Char *str; X X { X while ( *str ) X { X if ( ! isascii( *str ) || ! isupper( (char) *str ) ) X return ( 0 ); X ++str; X } X X return ( 1 ); X } X X X/* bubble - bubble sort an integer array in increasing order X * X * synopsis X * int v[n], n; X * bubble( v, n ); X * X * description X * sorts the first n elements of array v and replaces them in X * increasing order. X * X * passed X * v - the array to be sorted X * n - the number of elements of 'v' to be sorted */ X Xvoid bubble( v, n ) Xint v[], n; X X { X register int i, j, k; X X for ( i = n; i > 1; --i ) X for ( j = 1; j < i; ++j ) X if ( v[j] > v[j + 1] ) /* compare */ X { X k = v[j]; /* exchange */ X v[j] = v[j + 1]; X v[j + 1] = k; X } X } X X X/* clower - replace upper-case letter to lower-case X * X * synopsis: X * Char clower(); X * int c; X * c = clower( c ); X */ X XChar clower( c ) Xregister int c; X X { X return ( (isascii( c ) && isupper( c )) ? tolower( c ) : c ); X } X X X/* copy_string - returns a dynamically allocated copy of a string X * X * synopsis X * char *str, *copy, *copy_string(); X * copy = copy_string( str ); X */ X Xchar *copy_string( str ) Xregister char *str; X X { X register char *c; X char *copy; X X /* find length */ X for ( c = str; *c; ++c ) X ; X X copy = malloc( (unsigned) ((c - str + 1) * sizeof( char )) ); X X if ( copy == NULL ) X flexfatal( "dynamic memory failure in copy_string()" ); X X for ( c = copy; (*c++ = *str++); ) X ; X X return ( copy ); X } X X X/* copy_unsigned_string - X * returns a dynamically allocated copy of a (potentially) unsigned string X * X * synopsis X * Char *str, *copy, *copy_unsigned_string(); X * copy = copy_unsigned_string( str ); X */ X XChar *copy_unsigned_string( str ) Xregister Char *str; X X { X register Char *c; X Char *copy; X X /* find length */ X for ( c = str; *c; ++c ) X ; X X copy = (Char *) malloc( (unsigned) ((c - str + 1) * sizeof( Char )) ); X X if ( copy == NULL ) X flexfatal( "dynamic memory failure in copy_unsigned_string()" ); X X for ( c = copy; (*c++ = *str++); ) X ; X X return ( copy ); X } X X X/* cshell - shell sort a character array in increasing order X * X * synopsis X * X * Char v[n]; X * int n, special_case_0; X * cshell( v, n, special_case_0 ); X * X * description X * does a shell sort of the first n elements of array v. X * If special_case_0 is true, then any element equal to 0 X * is instead assumed to have infinite weight. X * X * passed X * v - array to be sorted X * n - number of elements of v to be sorted X */ X Xvoid cshell( v, n, special_case_0 ) XChar v[]; Xint n, special_case_0; X X { X int gap, i, j, jg; X Char k; X X for ( gap = n / 2; gap > 0; gap = gap / 2 ) X for ( i = gap; i < n; ++i ) X for ( j = i - gap; j >= 0; j = j - gap ) X { X jg = j + gap; X X if ( special_case_0 ) X { X if ( v[jg] == 0 ) X break; X X else if ( v[j] != 0 && v[j] <= v[jg] ) X break; X } X X else if ( v[j] <= v[jg] ) X break; X X k = v[j]; X v[j] = v[jg]; X v[jg] = k; X } X } X X X/* dataend - finish up a block of data declarations X * X * synopsis X * dataend(); X */ X Xvoid dataend() X X { X if ( datapos > 0 ) X dataflush(); X X /* add terminator for initialization */ X puts( " } ;\n" ); X X dataline = 0; X datapos = 0; X } X X X X/* dataflush - flush generated data statements X * X * synopsis X * dataflush(); X */ X Xvoid dataflush() X X { X putchar( '\n' ); X X if ( ++dataline >= NUMDATALINES ) X { X /* put out a blank line so that the table is grouped into X * large blocks that enable the user to find elements easily X */ X putchar( '\n' ); X dataline = 0; X } X X /* reset the number of characters written on the current line */ X datapos = 0; X } X X X/* flexerror - report an error message and terminate X * X * synopsis X * char msg[]; X * flexerror( msg ); X */ X Xvoid flexerror( msg ) Xchar msg[]; X X { X fprintf( stderr, "%s: %s\n", program_name, msg ); X X flexend( 1 ); X } X X X/* flexfatal - report a fatal error message and terminate X * X * synopsis X * char msg[]; X * flexfatal( msg ); X */ X Xvoid flexfatal( msg ) Xchar msg[]; X X { X fprintf( stderr, "%s: fatal internal error, %s\n", program_name, msg ); X flexend( 1 ); X } X X X/* flex_gettime - return current time X * X * synopsis X * char *flex_gettime(), *time_str; X * time_str = flex_gettime(); X * X * note X * the routine name has the "flex_" prefix because of name clashes X * with Turbo-C X */ X X/* include sys/types.h to use time_t and make lint happy */ X X#ifndef MS_DOS X#ifndef VMS X#include <sys/types.h> X#else X#include <types.h> X#endif X#endif X X#ifdef MS_DOS X#include <time.h> Xtypedef long time_t; X#endif X Xchar *flex_gettime() X X { X time_t t, time(); X char *result, *ctime(), *copy_string(); X X t = time( (long *) 0 ); X X result = copy_string( ctime( &t ) ); X X /* get rid of trailing newline */ X result[24] = '\0'; X X return ( result ); X } X X X/* lerrif - report an error message formatted with one integer argument X * X * synopsis X * char msg[]; X * int arg; X * lerrif( msg, arg ); X */ X Xvoid lerrif( msg, arg ) Xchar msg[]; Xint arg; X X { X char errmsg[MAXLINE]; X (void) sprintf( errmsg, msg, arg ); X flexerror( errmsg ); X } X X X/* lerrsf - report an error message formatted with one string argument X * X * synopsis X * char msg[], arg[]; X * lerrsf( msg, arg ); X */ X Xvoid lerrsf( msg, arg ) Xchar msg[], arg[]; X X { X char errmsg[MAXLINE]; X X (void) sprintf( errmsg, msg, arg ); X flexerror( errmsg ); X } X X X/* htoi - convert a hexadecimal digit string to an integer value X * X * synopsis: X * int val, htoi(); X * Char str[]; X * val = htoi( str ); X */ X Xint htoi( str ) XChar str[]; X X { X int result; X X (void) sscanf( (char *) str, "%x", &result ); X X return ( result ); X } X X X/* is_hex_digit - returns true if a character is a valid hex digit, false X * otherwise X * X * synopsis: X * int true_or_false, is_hex_digit(); X * int ch; X * val = is_hex_digit( ch ); X */ X Xint is_hex_digit( ch ) Xint ch; X X { X if ( isdigit( ch ) ) X return ( 1 ); X X switch ( clower( ch ) ) X { X case 'a': X case 'b': X case 'c': X case 'd': X case 'e': X case 'f': X return ( 1 ); X X default: X return ( 0 ); X } X } X X X/* line_directive_out - spit out a "# line" statement */ X Xvoid line_directive_out( output_file_name ) XFILE *output_file_name; X X { X if ( infilename && gen_line_dirs ) X fprintf( output_file_name, "# line %d \"%s\"\n", linenum, infilename ); X } X X X/* mk2data - generate a data statement for a two-dimensional array X * X * synopsis X * int value; X * mk2data( value ); X * X * generates a data statement initializing the current 2-D array to "value" X */ Xvoid mk2data( value ) Xint value; X X { X if ( datapos >= NUMDATAITEMS ) X { X putchar( ',' ); X dataflush(); X } X X if ( datapos == 0 ) X /* indent */ X fputs( " ", stdout ); X X else X putchar( ',' ); X X ++datapos; X X printf( "%5d", value ); X } X X X/* mkdata - generate a data statement X * X * synopsis X * int value; X * mkdata( value ); X * X * generates a data statement initializing the current array element to X * "value" X */ Xvoid mkdata( value ) Xint value; X X { X if ( datapos >= NUMDATAITEMS ) X { X putchar( ',' ); X dataflush(); X } X X if ( datapos == 0 ) X /* indent */ X fputs( " ", stdout ); X X else X putchar( ',' ); X X ++datapos; X X printf( "%5d", value ); X } X X X/* myctoi - return the integer represented by a string of digits X * X * synopsis X * Char array[]; X * int val, myctoi(); X * val = myctoi( array ); X * X */ X Xint myctoi( array ) XChar array[]; X X { X int val = 0; X X (void) sscanf( (char *) array, "%d", &val ); X X return ( val ); X } X X X/* myesc - return character corresponding to escape sequence X * X * synopsis X * Char array[], c, myesc(); X * c = myesc( array ); X * X */ X XChar myesc( array ) XChar array[]; X X { X Char c, esc_char; X register int sptr; X X switch ( array[1] ) X { X case 'a': return ( '\a' ); X case 'b': return ( '\b' ); X case 'f': return ( '\f' ); X case 'n': return ( '\n' ); X case 'r': return ( '\r' ); X case 't': return ( '\t' ); X case 'v': return ( '\v' ); X X case '0': X case '1': X case '2': X case '3': X case '4': X case '5': X case '6': X case '7': X case '8': X case '9': X { /* \<octal> */ X sptr = 1; X X while ( isascii( array[sptr] ) && isdigit( array[sptr] ) ) X /* don't increment inside loop control because if X * isdigit() is a macro it might expand into multiple X * increments ... X */ X ++sptr; X X c = array[sptr]; X array[sptr] = '\0'; X X esc_char = otoi( array + 1 ); X X array[sptr] = c; X X return ( esc_char ); X } X X case 'x': X { /* \x<hex> */ X int sptr = 2; X X while ( isascii( array[sptr] ) && is_hex_digit( array[sptr] ) ) X /* don't increment inside loop control because if X * isdigit() is a macro it might expand into multiple X * increments ... X */ X ++sptr; X X c = array[sptr]; X array[sptr] = '\0'; X X esc_char = htoi( array + 2 ); X X array[sptr] = c; X X return ( esc_char ); X } X X default: X return ( array[1] ); X } X } X X X/* otoi - convert an octal digit string to an integer value X * X * synopsis: X * int val, otoi(); X * Char str[]; X * val = otoi( str ); X */ X Xint otoi( str ) XChar str[]; X X { X int result; X X (void) sscanf( (char *) str, "%o", &result ); X X return ( result ); X } X X X/* readable_form - return the the human-readable form of a character X * X * synopsis: X * int c; X * char *readable_form(); X * <string> = readable_form( c ); X * X * The returned string is in static storage. X */ X Xchar *readable_form( c ) Xregister int c; X X { X static char rform[10]; X X if ( (c >= 0 && c < 32) || c >= 127 ) X { X switch ( c ) X { X case '\n': return ( "\\n" ); X case '\t': return ( "\\t" ); X case '\f': return ( "\\f" ); X case '\r': return ( "\\r" ); X case '\b': return ( "\\b" ); X X default: X (void) sprintf( rform, "\\%.3o", c ); X return ( rform ); X } X } X X else if ( c == ' ' ) X return ( "' '" ); X X else X { X rform[0] = c; X rform[1] = '\0'; X X return ( rform ); X } X } X X X/* reallocate_array - increase the size of a dynamic array */ X Xvoid *reallocate_array( array, size, element_size ) Xvoid *array; Xint size, element_size; X X { X register void *new_array; X X /* same worry as in allocate_array(): */ X if ( size * element_size <= 0 ) X flexfatal( "attempt to increase array size by less than 1 byte" ); X X new_array = X (void *) realloc( (char *)array, (unsigned) (size * element_size )); X X if ( new_array == NULL ) X flexfatal( "attempt to increase array size failed" ); X X return ( new_array ); X } X X X/* skelout - write out one section of the skeleton file X * X * synopsis X * skelout(); X * X * DESCRIPTION X * Copies from skelfile to stdout until a line beginning with "%%" or X * EOF is found. X */ Xvoid skelout() X X { X char buf[MAXLINE]; X X while ( fgets( buf, MAXLINE, skelfile ) != NULL ) X if ( buf[0] == '%' && buf[1] == '%' ) X break; X else X fputs( buf, stdout ); X } X X X/* transition_struct_out - output a yy_trans_info structure X * X * synopsis X * int element_v, element_n; X * transition_struct_out( element_v, element_n ); X * X * outputs the yy_trans_info structure with the two elements, element_v and X * element_n. Formats the output with spaces and carriage returns. X */ X Xvoid transition_struct_out( element_v, element_n ) Xint element_v, element_n; X X { X printf( "%7d, %5d,", element_v, element_n ); X X datapos += TRANS_STRUCT_PRINT_LENGTH; X X if ( datapos >= 75 ) X { X putchar( '\n' ); X X if ( ++dataline % 10 == 0 ) X putchar( '\n' ); X X datapos = 0; X } X } END_OF_FILE if test 14912 -ne `wc -c <'misc.c'`; then echo shar: \"'misc.c'\" unpacked with wrong size! fi # end of 'misc.c' fi if test -f 'nfa.c' -a "${1}" != "-c" ; then echo shar: Will not clobber existing file \"'nfa.c'\" else echo shar: Extracting \"'nfa.c'\" $17603 characters$ sed "s/^X//" >'nfa.c' <<'END_OF_FILE' X/* nfa - NFA construction routines */ X X/*- X * Copyright (c) 1990 The Regents of the University of California. X * All rights reserved. X * X * This code is derived from software contributed to Berkeley by X * Vern Paxson. X * X * The United States Government has rights in this work pursuant X * to contract no. DE-AC03-76SF00098 between the United States X * Department of Energy and the University of California. X * X * Redistribution and use in source and binary forms are permitted provided X * that: (1) source distributions retain this entire copyright notice and X * comment, and (2) distributions including binaries display the following X * acknowledgement: ``This product includes software developed by the X * University of California, Berkeley and its contributors'' in the X * documentation or other materials provided with the distribution and in X * all advertising materials mentioning features or use of this software. X * Neither the name of the University nor the names of its contributors may X * be used to endorse or promote products derived from this software without X * specific prior written permission. X * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED X * WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF X * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. X */ X X#ifndef lint Xstatic char rcsid[] = X "@(#) $Header: /usr/fsys/odin/a/vern/flex/RCS/nfa.c,v 2.6 90/06/27 23:48:29 vern Exp $ (LBL)"; X#endif X X#include "flexdef.h" X X X/* declare functions that have forward references */ X Xint dupmachine PROTO((int)); Xvoid mkxtion PROTO((int, int)); X X X/* add_accept - add an accepting state to a machine X * X * synopsis X * X * add_accept( mach, accepting_number ); X * X * accepting_number becomes mach's accepting number. X */ X Xvoid add_accept( mach, accepting_number ) Xint mach, accepting_number; X X { X /* hang the accepting number off an epsilon state. if it is associated X * with a state that has a non-epsilon out-transition, then the state X * will accept BEFORE it makes that transition, i.e., one character X * too soon X */ X X if ( transchar[finalst[mach]] == SYM_EPSILON ) X accptnum[finalst[mach]] = accepting_number; X X else X { X int astate = mkstate( SYM_EPSILON ); X accptnum[astate] = accepting_number; X mach = link_machines( mach, astate ); X } X } X X X/* copysingl - make a given number of copies of a singleton machine X * X * synopsis X * X * newsng = copysingl( singl, num ); X * X * newsng - a new singleton composed of num copies of singl X * singl - a singleton machine X * num - the number of copies of singl to be present in newsng X */ X Xint copysingl( singl, num ) Xint singl, num; X X { X int copy, i; X X copy = mkstate( SYM_EPSILON ); X X for ( i = 1; i <= num; ++i ) X copy = link_machines( copy, dupmachine( singl ) ); X X return ( copy ); X } X X X/* dumpnfa - debugging routine to write out an nfa X * X * synopsis X * int state1; X * dumpnfa( state1 ); X */ X Xvoid dumpnfa( state1 ) Xint state1; X X { X int sym, tsp1, tsp2, anum, ns; X X fprintf( stderr, "\n\n********** beginning dump of nfa with start state %d\n", X state1 ); X X /* we probably should loop starting at firstst[state1] and going to X * lastst[state1], but they're not maintained properly when we "or" X * all of the rules together. So we use our knowledge that the machine X * starts at state 1 and ends at lastnfa. X */ X X /* for ( ns = firstst[state1]; ns <= lastst[state1]; ++ns ) */ X for ( ns = 1; ns <= lastnfa; ++ns ) X { X fprintf( stderr, "state # %4d\t", ns ); X X sym = transchar[ns]; X tsp1 = trans1[ns]; X tsp2 = trans2[ns]; X anum = accptnum[ns]; X X fprintf( stderr, "%3d: %4d, %4d", sym, tsp1, tsp2 ); X X if ( anum != NIL ) X fprintf( stderr, " [%d]", anum ); X X fprintf( stderr, "\n" ); X } X X fprintf( stderr, "********** end of dump\n" ); X } X X X/* dupmachine - make a duplicate of a given machine X * X * synopsis X * X * copy = dupmachine( mach ); X * X * copy - holds duplicate of mach X * mach - machine to be duplicated X * X * note that the copy of mach is NOT an exact duplicate; rather, all the X * transition states values are adjusted so that the copy is self-contained, X * as the original should have been. X * X * also note that the original MUST be contiguous, with its low and high X * states accessible by the arrays firstst and lastst X */ X Xint dupmachine( mach ) Xint mach; X X { X int i, init, state_offset; X int state = 0; X int last = lastst[mach]; X X for ( i = firstst[mach]; i <= last; ++i ) X { X state = mkstate( transchar[i] ); X X if ( trans1[i] != NO_TRANSITION ) X { X mkxtion( finalst[state], trans1[i] + state - i ); X X if ( transchar[i] == SYM_EPSILON && trans2[i] != NO_TRANSITION ) X mkxtion( finalst[state], trans2[i] + state - i ); X } X X accptnum[state] = accptnum[i]; X } X X if ( state == 0 ) X flexfatal( "empty machine in dupmachine()" ); X X state_offset = state - i + 1; X X init = mach + state_offset; X firstst[init] = firstst[mach] + state_offset; X finalst[init] = finalst[mach] + state_offset; X lastst[init] = lastst[mach] + state_offset; X X return ( init ); X } X X X/* finish_rule - finish up the processing for a rule X * X * synopsis X * X * finish_rule( mach, variable_trail_rule, headcnt, trailcnt ); X * X * An accepting number is added to the given machine. If variable_trail_rule X * is true then the rule has trailing context and both the head and trail X * are variable size. Otherwise if headcnt or trailcnt is non-zero then X * the machine recognizes a pattern with trailing context and headcnt is X * the number of characters in the matched part of the pattern, or zero X * if the matched part has variable length. trailcnt is the number of X * trailing context characters in the pattern, or zero if the trailing X * context has variable length. X */ X Xvoid finish_rule( mach, variable_trail_rule, headcnt, trailcnt ) Xint mach, variable_trail_rule, headcnt, trailcnt; X X { X add_accept( mach, num_rules ); X X /* we did this in new_rule(), but it often gets the wrong X * number because we do it before we start parsing the current rule X */ X rule_linenum[num_rules] = linenum; X X /* if this is a continued action, then the line-number has X * already been updated, giving us the wrong number X */ X if ( continued_action ) X --rule_linenum[num_rules]; X X fprintf( temp_action_file, "case %d:\n", num_rules ); X X if ( variable_trail_rule ) X { X rule_type[num_rules] = RULE_VARIABLE; X X if ( performance_report ) X fprintf( stderr, "Variable trailing context rule at line %d\n", X rule_linenum[num_rules] ); X X variable_trailing_context_rules = true; X } X X else X { X rule_type[num_rules] = RULE_NORMAL; X X if ( headcnt > 0 || trailcnt > 0 ) X { X /* do trailing context magic to not match the trailing characters */ X char *scanner_cp = "yy_c_buf_p = yy_cp"; X char *scanner_bp = "yy_bp"; X X fprintf( temp_action_file, X "*yy_cp = yy_hold_char; /* undo effects of setting up yytext */\n" ); X X if ( headcnt > 0 ) X fprintf( temp_action_file, "%s = %s + %d;\n", X scanner_cp, scanner_bp, headcnt ); X X else X fprintf( temp_action_file, X "%s -= %d;\n", scanner_cp, trailcnt ); X X fprintf( temp_action_file, X "YY_DO_BEFORE_ACTION; /* set up yytext again */\n" ); X } X } X X line_directive_out( temp_action_file ); X } X X X/* link_machines - connect two machines together X * X * synopsis X * X * new = link_machines( first, last ); X * X * new - a machine constructed by connecting first to last X * first - the machine whose successor is to be last X * last - the machine whose predecessor is to be first X * X * note: this routine concatenates the machine first with the machine X * last to produce a machine new which will pattern-match first first X * and then last, and will fail if either of the sub-patterns fails. X * FIRST is set to new by the operation. last is unmolested. X */ X Xint link_machines( first, last ) Xint first, last; X X { X if ( first == NIL ) X return ( last ); X X else if ( last == NIL ) X return ( first ); X X else X { X mkxtion( finalst[first], last ); X finalst[first] = finalst[last]; X lastst[first] = max( lastst[first], lastst[last] ); X firstst[first] = min( firstst[first], firstst[last] ); X X return ( first ); X } X } X X X/* mark_beginning_as_normal - mark each "beginning" state in a machine X * as being a "normal" (i.e., not trailing context- X * associated) states X * X * synopsis X * X * mark_beginning_as_normal( mach ) X * X * mach - machine to mark X * X * The "beginning" states are the epsilon closure of the first state X */ X Xvoid mark_beginning_as_normal( mach ) Xregister int mach; X X { X switch ( state_type[mach] ) X { X case STATE_NORMAL: X /* oh, we've already visited here */ X return; X X case STATE_TRAILING_CONTEXT: X state_type[mach] = STATE_NORMAL; X X if ( transchar[mach] == SYM_EPSILON ) X { X if ( trans1[mach] != NO_TRANSITION ) X mark_beginning_as_normal( trans1[mach] ); X X if ( trans2[mach] != NO_TRANSITION ) X mark_beginning_as_normal( trans2[mach] ); X } X break; X X default: X flexerror( "bad state type in mark_beginning_as_normal()" ); X break; X } X } X X X/* mkbranch - make a machine that branches to two machines X * X * synopsis X * X * branch = mkbranch( first, second ); X * X * branch - a machine which matches either first's pattern or second's X * first, second - machines whose patterns are to be or'ed (the | operator) X * X * note that first and second are NEITHER destroyed by the operation. Also, X * the resulting machine CANNOT be used with any other "mk" operation except X * more mkbranch's. Compare with mkor() X */ X Xint mkbranch( first, second ) Xint first, second; X X { X int eps; X X if ( first == NO_TRANSITION ) X return ( second ); X X else if ( second == NO_TRANSITION ) X return ( first ); X X eps = mkstate( SYM_EPSILON ); X X mkxtion( eps, first ); X mkxtion( eps, second ); X X return ( eps ); X } X X X/* mkclos - convert a machine into a closure X * X * synopsis X * new = mkclos( state ); X * X * new - a new state which matches the closure of "state" X */ X Xint mkclos( state ) Xint state; X X { X return ( mkopt( mkposcl( state ) ) ); X } X X X/* mkopt - make a machine optional X * X * synopsis X * X * new = mkopt( mach ); X * X * new - a machine which optionally matches whatever mach matched X * mach - the machine to make optional X * X * notes: X * 1. mach must be the last machine created X * 2. mach is destroyed by the call X */ X Xint mkopt( mach ) Xint mach; X X { X int eps; X X if ( ! SUPER_FREE_EPSILON(finalst[mach]) ) X { X eps = mkstate( SYM_EPSILON ); X mach = link_machines( mach, eps ); X } X X /* can't skimp on the following if FREE_EPSILON(mach) is true because X * some state interior to "mach" might point back to the beginning X * for a closure X */ X eps = mkstate( SYM_EPSILON ); X mach = link_machines( eps, mach ); X X mkxtion( mach, finalst[mach] ); X X return ( mach ); X } X X X/* mkor - make a machine that matches either one of two machines X * X * synopsis X * X * new = mkor( first, second ); X * X * new - a machine which matches either first's pattern or second's X * first, second - machines whose patterns are to be or'ed (the | operator) X * X * note that first and second are both destroyed by the operation X * the code is rather convoluted because an attempt is made to minimize X * the number of epsilon states needed X */ X Xint mkor( first, second ) Xint first, second; X X { X int eps, orend; X X if ( first == NIL ) X return ( second ); X X else if ( second == NIL ) X return ( first ); X X else X { X /* see comment in mkopt() about why we can't use the first state X * of "first" or "second" if they satisfy "FREE_EPSILON" X */ X eps = mkstate( SYM_EPSILON ); X X first = link_machines( eps, first ); X X mkxtion( first, second ); X X if ( SUPER_FREE_EPSILON(finalst[first]) && X accptnum[finalst[first]] == NIL ) X { X orend = finalst[first]; X mkxtion( finalst[second], orend ); X } X X else if ( SUPER_FREE_EPSILON(finalst[second]) && X accptnum[finalst[second]] == NIL ) X { X orend = finalst[second]; X mkxtion( finalst[first], orend ); X } X X else X { X eps = mkstate( SYM_EPSILON ); X X first = link_machines( first, eps ); X orend = finalst[first]; X X mkxtion( finalst[second], orend ); X } X } X X finalst[first] = orend; X return ( first ); X } X X X/* mkposcl - convert a machine into a positive closure X * X * synopsis X * new = mkposcl( state ); X * X * new - a machine matching the positive closure of "state" X */ X Xint mkposcl( state ) Xint state; X X { X int eps; X X if ( SUPER_FREE_EPSILON(finalst[state]) ) X { X mkxtion( finalst[state], state ); X return ( state ); X } X X else X { X eps = mkstate( SYM_EPSILON ); X mkxtion( eps, state ); X return ( link_machines( state, eps ) ); X } X } X X X/* mkrep - make a replicated machine X * X * synopsis X * new = mkrep( mach, lb, ub ); X * X * new - a machine that matches whatever "mach" matched from "lb" X * number of times to "ub" number of times X * X * note X * if "ub" is INFINITY then "new" matches "lb" or more occurrences of "mach" X */ X Xint mkrep( mach, lb, ub ) Xint mach, lb, ub; X X { X int base_mach, tail, copy, i; X X base_mach = copysingl( mach, lb - 1 ); X X if ( ub == INFINITY ) X { X copy = dupmachine( mach ); X mach = link_machines( mach, X link_machines( base_mach, mkclos( copy ) ) ); X } X X else X { X tail = mkstate( SYM_EPSILON ); X X for ( i = lb; i < ub; ++i ) X { X copy = dupmachine( mach ); X tail = mkopt( link_machines( copy, tail ) ); X } X X mach = link_machines( mach, link_machines( base_mach, tail ) ); X } X X return ( mach ); X } X X X/* mkstate - create a state with a transition on a given symbol X * X * synopsis X * X * state = mkstate( sym ); X * X * state - a new state matching sym X * sym - the symbol the new state is to have an out-transition on X * X * note that this routine makes new states in ascending order through the X * state array (and increments LASTNFA accordingly). The routine DUPMACHINE X * relies on machines being made in ascending order and that they are X * CONTIGUOUS. Change it and you will have to rewrite DUPMACHINE (kludge X * that it admittedly is) X */ X Xint mkstate( sym ) Xint sym; X X { X if ( ++lastnfa >= current_mns ) X { X if ( (current_mns += MNS_INCREMENT) >= MAXIMUM_MNS ) X lerrif( "input rules are too complicated (>= %d NFA states)", X current_mns ); X X ++num_reallocs; X X firstst = reallocate_integer_array( firstst, current_mns ); X lastst = reallocate_integer_array( lastst, current_mns ); X finalst = reallocate_integer_array( finalst, current_mns ); X transchar = reallocate_integer_array( transchar, current_mns ); X trans1 = reallocate_integer_array( trans1, current_mns ); X trans2 = reallocate_integer_array( trans2, current_mns ); X accptnum = reallocate_integer_array( accptnum, current_mns ); X assoc_rule = reallocate_integer_array( assoc_rule, current_mns ); X state_type = reallocate_integer_array( state_type, current_mns ); X } X X firstst[lastnfa] = lastnfa; X finalst[lastnfa] = lastnfa; X lastst[lastnfa] = lastnfa; X transchar[lastnfa] = sym; X trans1[lastnfa] = NO_TRANSITION; X trans2[lastnfa] = NO_TRANSITION; X accptnum[lastnfa] = NIL; X assoc_rule[lastnfa] = num_rules; X state_type[lastnfa] = current_state_type; X X /* fix up equivalence classes base on this transition. Note that any X * character which has its own transition gets its own equivalence class. X * Thus only characters which are only in character classes have a chance X * at being in the same equivalence class. E.g. "a|b" puts 'a' and 'b' X * into two different equivalence classes. "[ab]" puts them in the same X * equivalence class (barring other differences elsewhere in the input). X */ X X if ( sym < 0 ) X { X /* we don't have to update the equivalence classes since that was X * already done when the ccl was created for the first time X */ X } X X else if ( sym == SYM_EPSILON ) X ++numeps; X X else X { X if ( useecs ) X /* map NUL's to csize */ X mkechar( sym ? sym : csize, nextecm, ecgroup ); X } X X return ( lastnfa ); X } X X X/* mkxtion - make a transition from one state to another X * X * synopsis X * X * mkxtion( statefrom, stateto ); X * X * statefrom - the state from which the transition is to be made X * stateto - the state to which the transition is to be made X */ X Xvoid mkxtion( statefrom, stateto ) Xint statefrom, stateto; X X { X if ( trans1[statefrom] == NO_TRANSITION ) X trans1[statefrom] = stateto; X X else if ( (transchar[statefrom] != SYM_EPSILON) || X (trans2[statefrom] != NO_TRANSITION) ) X flexfatal( "found too many transitions in mkxtion()" ); X X else X { /* second out-transition for an epsilon state */ X ++eps2; X trans2[statefrom] = stateto; X } X } X X/* new_rule - initialize for a new rule X * X * synopsis X * X * new_rule(); X * X * the global num_rules is incremented and the any corresponding dynamic X * arrays (such as rule_type[]) are grown as needed. X */ X Xvoid new_rule() X X { X if ( ++num_rules >= current_max_rules ) X { X ++num_reallocs; X current_max_rules += MAX_RULES_INCREMENT; X rule_type = reallocate_integer_array( rule_type, current_max_rules ); X rule_linenum = X reallocate_integer_array( rule_linenum, current_max_rules ); X } X X if ( num_rules > MAX_RULE ) X lerrif( "too many rules (> %d)!", MAX_RULE ); X X rule_linenum[num_rules] = linenum; X } END_OF_FILE if test 17603 -ne `wc -c <'nfa.c'`; then echo shar: \"'nfa.c'\" unpacked with wrong size! fi # end of 'nfa.c' fi echo shar: End of archive 8 $of 10$. cp /dev/null ark8isdone MISSING="" for I in 1 2 3 4 5 6 7 8 9 10 ; do if test ! -f ark${I}isdone ; then MISSING="${MISSING} ${I}" fi done if test "${MISSING}" = "" ; then echo You have unpacked all 10 archives. rm -f ark[1-9]isdone ark[1-9][0-9]isdone else echo You still must unpack the following archives: echo " " ${MISSING} fi exit 0 exit 0 # Just in case...